    A hierarchical, fuzzy inference approach to data filtration and feature prioritization in the connected manufacturing enterprise

    In manufacturing, the technology to capture and store large volumes of data developed earlier and faster than the corresponding capabilities to analyze, interpret, and apply it. The result for many manufacturers is a collection of unanalyzed data and uncertainty about where to begin. This paper examines big data as both an enabler of and a challenge for the connected manufacturing enterprise and presents a framework that sequentially tests and selects independent variables for training applied machine learning models. Unsuitable features are discarded, and each remaining feature receives a crisp numeric output and a linguistic label, both of which measure the feature's suitability. The framework is tested on three datasets employing time series, binary, and continuous input data. Results from the filtered models are compared to results from the base, unfiltered feature sets using a proposed performance-size ratio metric. The framework outperforms the base feature sets in all tested cases; proposed future research is to implement it in a case study in electronic assembly manufacturing.
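
    The abstract does not define the performance-size ratio, so the following is a minimal sketch of one plausible reading, in which a model's cross-validated score is divided by the number of features it uses; the dataset, model, and the ratio's exact form are illustrative assumptions, not the paper's definition:

        # Hypothetical performance-size ratio: reward predictive score per feature used.
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import cross_val_score

        def performance_size_ratio(model, X, y):
            """Assumed form: mean cross-validated F1 divided by feature count."""
            score = cross_val_score(model, X, y, cv=5, scoring="f1").mean()
            return score / X.shape[1]

        # shuffle=False keeps the 5 informative features in the first columns,
        # so X[:, :5] plays the role of a filtered feature set.
        X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                                   shuffle=False, random_state=0)
        base = performance_size_ratio(RandomForestClassifier(random_state=0), X, y)
        filtered = performance_size_ratio(RandomForestClassifier(random_state=0), X[:, :5], y)
        print(f"base PSR: {base:.4f}  filtered PSR: {filtered:.4f}")

    Under this reading, a filtered set that matches the base set's score with a quarter of the features earns roughly four times the ratio, which is consistent with the paper's reported direction of results.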

    A Survey of Feature Set Reduction Approaches for Predictive Analytics Models in the Connected Manufacturing Enterprise

    The broad context of this literature review is the connected manufacturing enterprise, characterized by a data environment in which the size, structure, and variety of information strain the capability of traditional software and database tools to effectively capture, store, manage, and analyze it. This paper surveys and discusses representative examples of existing research into approaches for feature set reduction in the big data environment, focusing on three contexts: general industrial applications; specific industrial applications such as fault detection or fault prediction; and data reduction. The conclusion of the review is that there is room for research into frameworks or approaches to feature filtration and prioritization, specifically with respect to providing quantitative or qualitative information about the individual features in a dataset that can be used to rank features against each other. A byproduct of this gap is a tendency for analysts not to generalize results beyond the specific problem of interest and, relatedly, for manufacturers to possess only limited knowledge of the relative value of the smart manufacturing data they collect.
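
    One concrete instance of the per-feature quantitative ranking the review identifies as a gap is mutual information scoring; a minimal scikit-learn sketch follows (the dataset is synthetic and illustrative, standing in for smart manufacturing sensor data):

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.feature_selection import mutual_info_classif

        # Illustrative stand-in for manufacturing data: 3 informative features among 10.
        X, y = make_classification(n_samples=1000, n_features=10, n_informative=3,
                                   shuffle=False, random_state=0)

        # Score each feature's dependence on the target, then rank features against each other.
        scores = mutual_info_classif(X, y, random_state=0)
        for idx in np.argsort(scores)[::-1]:
            print(f"feature {idx}: MI = {scores[idx]:.3f}")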

    Predicting Contact-without-connection Defects on Printed Circuit Boards Employing Ball Grid Array Package Types: A Data Analytics Case Study in the Smart Manufacturing Environment

    This research presents an exploratory data analytics case study in defect prediction on printed circuit boards (PCBs) employing ball grid array (BGA) package types during assembly. BGA package types are of interest because their defects are difficult to identify and costly to rework. While much of the existing research is dedicated to techniques for identifying and diagnosing BGA defects, this research attempts to preempt them by using parametric data measured by solder paste inspection (SPI) machines as input to applied machine learning models. Two modeling approaches are explored: one analyzes individual solder paste deposits; the other holistically analyzes all solder paste deposits at a single PCB location. The latter approach employs feature generation to extract a broad set of features from the arrays of SPI data and feature selection techniques for dimensionality reduction. Models trained on the reduced feature sets provide encouraging initial results, with precision, recall, and F1 score exceeding 0.82, 0.50, and 0.62, respectively, for each of the two datasets analyzed.
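
    A minimal sketch of the second modeling approach as described above: summary-statistic feature generation over each board location's array of SPI deposit measurements, then univariate feature selection and a classifier. The data, statistics chosen, and model are illustrative assumptions, not the authors' pipeline:

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.feature_selection import SelectKBest, f_classif
        from sklearn.metrics import precision_score, recall_score, f1_score
        from sklearn.model_selection import train_test_split
        from sklearn.pipeline import make_pipeline

        rng = np.random.default_rng(0)

        # Illustrative stand-in for SPI data: one row of deposit volumes per PCB location.
        deposits = rng.normal(100.0, 15.0, size=(600, 50))   # 600 boards, 50 deposits each
        labels = rng.integers(0, 2, size=600)                # 1 = contact-without-connection defect

        # Feature generation: collapse each deposit array into summary statistics.
        features = np.column_stack([
            deposits.mean(axis=1), deposits.std(axis=1),
            deposits.min(axis=1), deposits.max(axis=1),
            np.percentile(deposits, 25, axis=1), np.percentile(deposits, 75, axis=1),
        ])

        # Feature selection for dimensionality reduction, then a classifier.
        X_tr, X_te, y_tr, y_te = train_test_split(features, labels, random_state=0)
        model = make_pipeline(SelectKBest(f_classif, k=4), RandomForestClassifier(random_state=0))
        model.fit(X_tr, y_tr)
        pred = model.predict(X_te)
        print(f"precision={precision_score(y_te, pred):.2f} "
              f"recall={recall_score(y_te, pred):.2f} f1={f1_score(y_te, pred):.2f}")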

    Simulation Analysis of Applicant Scheduling and Processing Alternatives at a Military Entrance Processing Station

    Eligibility for enlistment into the US military is assessed by the United States Military Entrance Processing Command (USMEPCOM), an independent agency that reports to the Office of the Secretary of Defense (OSD) rather than to any specific branch of military service. This research develops a discrete-event simulation of applicant processing operations at a Military Entrance Processing Station (MEPS) to investigate the viability of potential alternatives to the current applicant arrival and processing operation. Currently, all applicants arrive at the MEPS at the beginning of the processing day in a single batch. This research models and compares two alternatives with the status quo: split-shift processing, in which applicant arrivals occur in two batches (one at 06:00 and one at 11:00), and appointment-based processing, in which applicants may arrive during one of three, four, six, or eight appointment windows. Express-lane processing is also explored, in which applicants are allowed to bypass select processing stations. Experimental results indicate that split-shift processing is not viable under the current processing model due to an unacceptable decrease in applicant throughput. Results from the appointment-based scenarios are mixed, with the critical factors being the time between appointment batches and their associated arrival times.
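
    A minimal discrete-event sketch of the arrival-policy comparison using SimPy; the station layout, capacity, service time, and applicant counts are illustrative assumptions, not the study's calibrated model:

        import simpy

        PROCESS_MIN = 30  # assumed per-applicant service time, in minutes

        def applicant(env, name, station, served):
            """One applicant seizes the capacity-limited processing station."""
            with station.request() as req:
                yield req
                yield env.timeout(PROCESS_MIN)
                served.append((name, env.now))

        def arrivals(env, station, served, batches):
            """Release a batch of applicants at each scheduled arrival time."""
            for start, size in batches:
                yield env.timeout(max(0, start - env.now))
                for i in range(size):
                    env.process(applicant(env, f"t{start}+{i}", station, served))

        def run(batches, day_minutes=540):
            env = simpy.Environment()
            station = simpy.Resource(env, capacity=4)  # assumed station capacity
            served = []
            env.process(arrivals(env, station, served, batches))
            env.run(until=day_minutes)
            return len(served)  # throughput: applicants finished within the day

        print("single batch:", run([(0, 40)]))             # status quo: all arrive at opening
        print("split shift :", run([(0, 20), (300, 20)]))  # batches 5 hours apart (06:00, 11:00)

    Comparing the returned throughput across policies is the core of the experiment; the study's full model would add the individual processing stations, express-lane bypasses, and appointment-window scenarios.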

    Social Network Analysis of Twitter Interactions: A Directed Multilayer Network Approach

    Effective employment of social media for any social influence outcome requires a detailed understanding of the target audience. Social media provides a rich repository of self-reported information that offers insight into the sentiments and implied priorities of an online population. Using Social Network Analysis, this research models user interactions on Twitter as a weighted, directed network. Topic modeling through Latent Dirichlet Allocation identifies the topics of discussion in Tweets, which this study uses to induce a directed multilayer network in which users (in one layer) are connected to the conversations and topics (in a second layer) in which they have participated, with inter-layer connections representing user participation in conversations. Analysis of the resulting network identifies both influential users and highly connected groups of individuals, informing an understanding of group dynamics and individual connectivity. The results demonstrate that generating a topically focused social network to represent conversations yields more robust findings regarding influential users, particularly when analysts collect Tweets from a variety of discussions through more general search queries. Within the analysis, PageRank performed best among the four measures used to rank individual influence in this problem context. In contrast, results from applying the Greedy Modularity Algorithm and the Leiden Algorithm to identify communities were mixed; each method yielded valuable insights, but neither technique was uniformly superior. The demonstrated four-step process is readily replicable, and an interested user can automate it with relatively little effort or expense.
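
    A minimal sketch of the user-layer analysis with NetworkX: build the weighted, directed interaction graph, rank users by PageRank, and detect communities via greedy modularity maximization. The edge data is illustrative, and the Leiden step is omitted because it requires the separate leidenalg/igraph packages:

        import networkx as nx
        from networkx.algorithms.community import greedy_modularity_communities

        # Illustrative interaction edges: (source user, target user, interaction count).
        interactions = [
            ("alice", "bob", 5), ("bob", "alice", 2), ("carol", "alice", 7),
            ("dave", "carol", 1), ("erin", "alice", 3), ("dave", "bob", 4),
        ]

        G = nx.DiGraph()
        G.add_weighted_edges_from(interactions)

        # Rank individual influence; PageRank performed best in the study's comparison.
        ranks = nx.pagerank(G, weight="weight")
        for user, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
            print(f"{user}: {score:.3f}")

        # Community detection via greedy modularity, on the undirected projection.
        communities = greedy_modularity_communities(G.to_undirected(), weight="weight")
        print([sorted(c) for c in communities])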